Generating random networks with given degree-degree correlations and degree-dependent clustering
Random networks are widely used to model complex networks and to study their properties. To
approximate the complex networks encountered in various scientific disciplines well, it is
important to be able to tune the statistical properties of random networks. In this manuscript
we present an algorithm that constructs networks with arbitrary degree-degree correlations and
adjustable degree-dependent clustering. We verify the algorithm using empirical networks as
input and additionally describe a simple way to fix a degree-dependent clustering function
when the degree-degree correlations are given.

PACS numbers: 89.75.Hc, 05.40.-a 



I. INTRODUCTION 

Modeling empirical networks as random networks is an important approach to studying the
topology and dynamics of complex networks. The first attempts at constructing random networks
that exhibit some of the common features regularly found in empirical networks from fields as
different as biology, social sciences, and technology mostly aimed at understanding the origin
of scale-free degree distributions (the degree of a vertex being its number of connections)
and of small average distances among vertices [1, 2]. However, it has been found that there
are other important statistical quantities that profoundly influence the structure of complex
networks and consequently the dynamics taking place on them. Notable among them are
degree-degree correlations of vertices [3, 4, 5] and the abundance of motifs [6, 7, 8]. The
smallest and probably most important motif in undirected graphs is the triangle. Its abundance
is called clustering, and several measures have been proposed to quantify it [9]. Some refined
network growing mechanisms that extend the preferential attachment scheme introduced by
Barabási and Albert to generate "scale-free" graphs [1], and that produce graphs which are
either correlated or clustered, have been proposed [10, 11, 12, 13]. Those algorithms are,
however, restricted in the correlation and clustering patterns they are able to produce.

Therefore, some efforts have recently been undertaken to overcome these restrictions. For
example, the very successful configuration model (CM) algorithm [14, 15, 16], capable of
generating random networks with an a priori given degree distribution, has been extended to
include either degree-degree correlations or clustering properties of networks. Serrano and
Boguñá presented an algorithm capable of tuning the degree-dependent clustering coefficient as
well as the degree distribution [17]. Additionally, they pointed out that clustering and
degree-degree correlations are deeply entwined, the latter limiting the former especially for
vertices of high degree, in particular in disassortative networks, where vertices of high
degree are preferentially connected to vertices of low degree and vice versa. As both
properties, clustering and correlations, are very important for the structure of a network and
strongly related to each other, it is a natural ansatz to control degree-dependent clustering
and the correlation pattern simultaneously to achieve better null models of complex networks.
In this manuscript, we propose an algorithm to construct random networks with a given
degree-degree correlation structure and degree-dependent clustering. The manuscript is
organized as follows: Section II introduces the network clustering and correlation measures
used. Section III describes the algorithm to construct degree-degree correlated and clustered
networks and verifies our scheme by applying it to empirical networks. Section IV presents a
simple way to create networks with certain correlations and clustering and shows some results
of this approach. In Section V we briefly summarize.



II. NETWORK CORRELATION AND CLUSTERING MEASURES

Two-point degree-degree correlations can be described statistically via a degree-degree
correlation function $P(j,k)$, which is the probability that a randomly chosen edge has
vertices of degrees $j$ and $k$ at its ends. In the case of uncorrelated networks, the
correlation function factorizes into $P_u(j,k) = kP(k)\,jP(j)/\langle k\rangle^2$, where
$P(k)$ is the degree distribution. Thus it appears natural to define a correlation function
$f(j,k)$ as

$$f(j,k) = \frac{P(j,k)}{P_u(j,k)}. \tag{1}$$

Values of $f(j,k)$ different from 1 signal degree-degree correlations in the underlying
network. A simpler but more coarse-grained way to quantify degree-degree correlations is the
average nearest neighbor function $k_{nn}(k)$, describing the average degree of the neighbors
of vertices with degree $k$. It can be calculated from the conditional probability
$P(j|k) = P(j,k)\langle k\rangle/[kP(k)]$ as

$$k_{nn}(k) = \sum_j j\,P(j|k). \tag{2}$$

A network with an increasing (decreasing) $k_{nn}(k)$ is called assortatively
(disassortatively) correlated.
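
As an illustration of Eqs. (1) and (2), the following minimal sketch measures $P(k)$, $P(j,k)$, $f(j,k)$, and $k_{nn}(k)$ from an edge list. The function name `correlation_measures` and the edge-list input are assumptions made for this example; each edge is counted once per direction so that $P(j,k)$ is symmetric and normalized, and isolated vertices are ignored because they do not appear in an edge list.

```python
from collections import Counter, defaultdict


def correlation_measures(edges):
    """Return P(k), P(j,k), f(j,k) and k_nn(k) for a simple undirected edge list."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    n = len(degree)
    mean_k = sum(degree.values()) / n

    # Degree distribution P(k).
    p_k = {k: count / n for k, count in Counter(degree.values()).items()}

    # P(j,k): every edge is counted once per direction, so the function is
    # symmetric and sums to 1.
    ends = Counter()
    for u, v in edges:
        ends[(degree[u], degree[v])] += 1
        ends[(degree[v], degree[u])] += 1
    p_jk = {jk: count / (2 * len(edges)) for jk, count in ends.items()}

    f_jk = {}
    k_nn = defaultdict(float)
    for (j, k), p in p_jk.items():
        p_uncorrelated = k * p_k[k] * j * p_k[j] / mean_k**2
        f_jk[(j, k)] = p / p_uncorrelated             # Eq. (1)
        k_nn[k] += j * p * mean_k / (k * p_k[k])      # Eq. (2), using P(j|k)
    return p_k, p_jk, f_jk, dict(k_nn)


# Star with three leaves: the hub (degree 3) only has neighbors of degree 1.
p_k, p_jk, f_jk, k_nn = correlation_measures([(0, 1), (0, 2), (0, 3)])
```

For the star, the neighbors of degree-1 vertices have average degree 3 and the neighbors of the hub have average degree 1, so $k_{nn}(k)$ decreases with $k$ and the graph is disassortative.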

Clustering was originally defined by Watts and Strogatz [2] for the vertex $i$ to be

$$c_i = \frac{2T_i}{k_i(k_i-1)}, \tag{3}$$

where $T_i$ denotes the number of triangles passing through vertex $i$. Clearly this measure
is a three-point quantity, as counting triangles requires knowledge of three connected
vertices at the same time. However, it is common practice to average the clustering
coefficients $c_i$ of all vertices with the same degree $k$, yielding a degree-dependent
clustering coefficient

$$c(k) = \frac{1}{k(k-1)P(k)N}\sum_{i\in\Upsilon(k)} 2T_i, \tag{4}$$

where $\Upsilon(k)$ denotes the set of vertices with degree $k$.
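
Equations (3) and (4) can be evaluated directly from an adjacency structure. The sketch below is one possible way to do so; the function name `degree_dependent_clustering` is an assumption of this example, and vertices with degree below 2 are skipped because the coefficients are undefined for them.

```python
from collections import defaultdict
from itertools import combinations


def degree_dependent_clustering(edges):
    """Return the local coefficients c_i (Eq. 3) and c(k) (Eq. 4)."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    # T_i: number of connected pairs among the neighbors of vertex i.
    triangles = {
        i: sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        for i, nbrs in adj.items()
    }

    # Eq. (3): local clustering, defined only for k_i >= 2.
    c_i = {i: 2 * t / (len(adj[i]) * (len(adj[i]) - 1))
           for i, t in triangles.items() if len(adj[i]) >= 2}

    # Eq. (4): sum 2*T_i over the vertices of degree k; P(k)*N is their number.
    twice_t = defaultdict(int)
    n_k = defaultdict(int)
    for i, t in triangles.items():
        k = len(adj[i])
        twice_t[k] += 2 * t
        n_k[k] += 1
    c_k = {k: twice_t[k] / (k * (k - 1) * n_k[k]) for k in n_k if k >= 2}
    return c_i, c_k


# A triangle (0, 1, 2) with one pendant vertex 3 attached to vertex 2.
c_i, c_k = degree_dependent_clustering([(0, 1), (1, 2), (0, 2), (2, 3)])
```

In this small graph the two degree-2 vertices are fully clustered, while the degree-3 vertex closes only one of its three possible neighbor pairs.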

Serrano and Boguñá pointed out that the degree-dependent clustering $c(k)$ is restricted by
degree-degree correlations and is often found to be a decreasing function of $k$ [17]. They
calculated an upper limit $\lambda(k)$ of $c(k)$ that depends on the degree-degree correlation
function $P(j,k)$. The main reasoning is that an edge cannot be part of more than
$\min(k_i,k_j)-1$ triangles, with $k_i$ and $k_j$ being the degrees of the vertices connected
by it. This results in a constraint on the number of triangles $T_i$ for any vertex $i$,

$$T_i \le \frac{1}{2}\sum_j a_{ij}\left[\min(k_i,k_j)-1\right]. \tag{5}$$

Here $a_{ij}$ is the network's adjacency matrix. The upper limit $\lambda(k)$ of the
degree-dependent clustering $c(k)$ can then be written as

$$\lambda(k) = 1 - \frac{1}{k-1}\sum_{j=1}^{k}(k-j)P(j|k). \tag{6}$$

This function is always a decreasing function of $k$, and its slope depends strongly on the
average neighbor degree $k_{nn}(k)$. This means that the degree-dependent clustering $c(k)$
can be written as

$$c(k) = c_{\mathrm{eff}}(k)\,\lambda(k), \tag{7}$$

with $0 \le c_{\mathrm{eff}}(k) \le 1\ \forall k$. Thus $c_{\mathrm{eff}}(k)$ can be regarded
as an effective degree-dependent clustering once the degree-degree correlations are fixed.
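
A short sketch of how $\lambda(k)$ in Eq. (6) and $c_{\mathrm{eff}}(k)$ in Eq. (7) might be computed from an edge list follows. The function names are assumptions of this example, and $P(j|k)$ is estimated as the fraction of edge ends at degree-$k$ vertices whose other end has degree $j$.

```python
from collections import Counter


def clustering_upper_bound(edges):
    """lambda(k) of Eq. (6), evaluated from edge-end degree counts."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1

    # counts[(j, k)]: edge ends at degree-k vertices whose other end has degree j.
    counts = Counter()
    for u, v in edges:
        counts[(degree[v], degree[u])] += 1
        counts[(degree[u], degree[v])] += 1
    ends_at = Counter()
    for (j, k), c in counts.items():
        ends_at[k] += c

    lam = {}
    for k in ends_at:
        if k < 2:
            continue  # c(k) is undefined for k < 2
        # P(j|k) = counts[(j, k)] / ends_at[k]; only j <= k contributes.
        s = sum((k - j) * c / ends_at[k]
                for (j, kk), c in counts.items() if kk == k and j <= k)
        lam[k] = 1.0 - s / (k - 1)
    return lam


def effective_clustering(c_k, lam):
    """c_eff(k) = c(k) / lambda(k), Eq. (7); skipped where lambda(k) = 0."""
    return {k: c_k[k] / lam[k] for k in c_k if lam.get(k, 0) > 0}


# Triangle (0, 1, 2) with a pendant vertex 3 attached to vertex 2.
lam = clustering_upper_bound([(0, 1), (1, 2), (0, 2), (2, 3)])
c_eff = effective_clustering({2: 1.0, 3: 1 / 3}, lam)
```

In this example the degree-3 vertex has neighbors of degrees 2, 2, and 1, so at most one triangle can pass through it; the bound $\lambda(3) = 1/3$ is saturated and $c_{\mathrm{eff}}(3) = 1$.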

In the following, we describe an algorithm that is able to control the two quantities
$P(j,k)$ and $c(k)$ [or $c_{\mathrm{eff}}(k)$] simultaneously.



III. ALGORITHM 

As already stated, an algorithm that creates networks with a given degree distribution and a
given level of clustering has been published by Serrano and Boguñá [17]. We incorporated some
of their basic ideas into our approach, which fixes the degree-degree correlations in addition
to the degree-dependent clustering.



The algorithm constructs a network with $N$ vertices and given $P_d(j,k)$ and $c_d(k)$, where
$P_d(j,k)$ is the number of connections between vertices with degrees $k$ and $j$ (double that
number if $k = j$) and $c_d(k)$ is the number of triangle edges constituted by vertices with
degree $k$. The overall scheme is the following:

We begin by assigning a number of stubs (the target degree) to every vertex according to the
degree distribution $P_d(k)$, which is calculated from $P_d(j,k)$ as

$$P_d(k) = \sum_j P_d(j,k)/k.$$
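
The stub assignment can be sketched as follows. This is a minimal illustration, assuming that $P_d(j,k)$ is stored as a symmetric dictionary keyed by degree pairs (with the diagonal doubled, as defined above); the function name `degree_sequence` is hypothetical.

```python
from collections import defaultdict


def degree_sequence(p_d):
    """Target degrees from P_d(j,k), via P_d(k) = sum_j P_d(j,k) / k."""
    stub_count = defaultdict(int)
    for (j, k), connections in p_d.items():
        stub_count[k] += connections      # edge ends attached to degree-k vertices
    sequence = []
    for k in sorted(stub_count):
        n_vertices, remainder = divmod(stub_count[k], k)
        if remainder:
            raise ValueError(f"stubs of degree {k} do not fill whole vertices")
        sequence.extend([k] * n_vertices)  # P_d(k) vertices with k stubs each
    return sequence


# P_d(j,k) of a triangle with one pendant vertex (degrees 2, 2, 3, 1).
p_d = {(2, 2): 2, (2, 3): 2, (3, 2): 2, (1, 3): 1, (3, 1): 1}
print(degree_sequence(p_d))
```

The consistency check on the remainder is an addition of this sketch; it flags inputs for which the number of edge ends at degree $k$ is not a multiple of $k$.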

The next step is to get a list of degrees of triangle corners, which shall contain $c_d(k)$
entries with value $k$. We also keep a copy $P'_d$ of $P_d(j,k)$ and a copy $c'_d$ of
$c_d(k)$; these are dynamical quantities in the sense that they are decreased with every
connection and triangle built. Thus for every connection built we decrease the appropriate
entry in the $P'_d(j,k)$ matrix by 1. For every triangle built, we delete one entry from the
triangle list and decrease $c'_d(k)$ by 1 for every degree involved. To detect new triangles,
we check for common neighbors of the two vertices involved whenever we place a connection, as
every shared neighbor accounts for one new triangle.
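
The bookkeeping described above might be organized as in the following sketch. The class name `BuildState` and its methods are hypothetical; the sketch assumes that $P'_d$ is stored symmetrically (so both $(j,k)$ and $(k,j)$ are decreased per connection) and that the triangle list is represented by the counter $c'_d$, with one corner removed per degree involved in each new triangle.

```python
from collections import Counter


class BuildState:
    """Dynamic quantities P'_d and c'_d plus the partially built graph."""

    def __init__(self, p_d, c_d, target_degree):
        self.p_rem = Counter(p_d)          # P'_d(j,k), stored symmetrically
        self.c_rem = Counter(c_d)          # c'_d(k)
        self.k = dict(target_degree)
        self.adj = {v: set() for v in target_degree}

    def triangle_list(self):
        # c'_d(k) entries of value k; exhausted degrees drop out automatically.
        return sorted(self.c_rem.elements())

    def add_edge(self, u, v):
        if u == v or v in self.adj[u]:
            raise ValueError("self-connections and multiple edges are forbidden")
        shared = self.adj[u] & self.adj[v]  # each shared neighbor closes a triangle
        self.adj[u].add(v)
        self.adj[v].add(u)
        self.p_rem[(self.k[u], self.k[v])] -= 1
        self.p_rem[(self.k[v], self.k[u])] -= 1
        for w in shared:
            for corner in (u, v, w):
                self.c_rem[self.k[corner]] -= 1
        return len(shared)                  # number of triangles just built


state = BuildState(
    {(2, 2): 2, (2, 3): 2, (3, 2): 2, (1, 3): 1, (3, 1): 1},  # P_d(j,k)
    {2: 2, 3: 1},                                             # triangle corners per degree
    {0: 2, 1: 2, 2: 3, 3: 1},                                 # target degrees
)
for u, v in [(0, 1), (1, 2), (0, 2), (2, 3)]:
    state.add_edge(u, v)
```

After the four connections of this toy input are placed, both the triangle list and all entries of $P'_d$ are exhausted.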

Then we start to build all triangles in the triangle list one by one. Let $v_i$ be the
vertices involved and $k_i$ their target degrees.

1. We draw a random entry $k_1$ from the triangle list and draw a corresponding vertex $v_1$
with at least one free stub. If we cannot find such a vertex, we delete all entries with
value $k_1$ from the triangle list and start again.

2. Now we choose with uniform probability either (a) an edge or (b) a stub of vertex $v_1$
out of a list created by omitting all edges for whose end vertex no more triangles can be
built [i.e., $c'_d(k) = 0$]. In case (a), we have chosen an edge and its end vertex is $v_2$.
If we have drawn a stub (b), we get a vertex $v_2$ in the same manner as we got vertex $v_1$,
with the further condition $P'_d(k_1,k_2) > 0$. If it is not possible to find a $k_2$
fulfilling this condition, we delete all entries with value $k_1$ from the triangle list and
start again.

3. Next, we draw (a) an edge or (b) a stub of vertex $v_2$ from a list built as in the
preceding step for vertex $v_1$, but with edges inserted into the list only if they fulfill
the supplementary condition of $P'_d(k_1,k_3) > 0$ or vertex $v_3$ being connected to vertex
$v_1$, and vertices $v_1$, $v_2$, and $v_3$ not already constituting a triangle. Having drawn
an edge (a), we close the triangle by adding the missing edges and updating all dynamic
quantities. Having drawn a stub (b), we choose a $k_3$ from the triangle list consistent with
$k_1$ and $k_2$. It might happen that this is not possible, in which case we start again.
Once we have a $k_3$, we draw a vertex $v_3$ which either has enough free stubs or is already
connected to vertex $v_1$ or $v_2$, add the missing edges, and update all dynamic quantities.

Note that in steps 2 and 3 the case of two or three degrees being the same has to be properly
taken into account in order not to build too many triangles or connections, and that
self-connections are forbidden.
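
The uniform choice between existing edges and free stubs in step 2 can be illustrated with a small helper. This is only a sketch of that single choice, not of the full triangle-building loop; the function name `draw_edge_or_stub` and the dictionary-based inputs are assumptions of this example.

```python
import random


def draw_edge_or_stub(v, adj, target_degree, c_rem, rng):
    """Pick uniformly among the usable edges and the free stubs of vertex v.

    Edges whose end vertex has a degree k with c'_d(k) = 0 are omitted,
    because no more triangles can be built at that vertex.
    Returns ("edge", w), ("stub", None), or None if nothing is left.
    """
    options = [("edge", w) for w in sorted(adj[v])
               if c_rem.get(target_degree[w], 0) > 0]
    free_stubs = target_degree[v] - len(adj[v])
    options += [("stub", None)] * free_stubs
    return rng.choice(options) if options else None


adj = {0: {1}, 1: {0}, 2: set()}
target_degree = {0: 1, 1: 3, 2: 2}
c_rem = {1: 1, 2: 0, 3: 1}          # remaining triangle corners per degree
rng = random.Random(1)
print(draw_edge_or_stub(0, adj, target_degree, c_rem, rng))  # only an edge is available
print(draw_edge_or_stub(2, adj, target_degree, c_rem, rng))  # only stubs are available
```

Listing each free stub as a separate option gives every edge and every stub the same probability, which is one way to read "with uniform probability" in step 2.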

Those steps are repeated until we cannot build any more triangles. This point is reached
either when the triangle list is empty or after a maximum number of successive tries that did
not result in a triangle being built.

Afterwards we build the rest of the graph by randomly choosing edges out of the remaining
edge list, which contains $P'_d(k_1,k_2)$ entries $(k_1,k_2)$ for all degrees $k_1, k_2$. For
the chosen entry we randomly choose two non-identical vertices of these degrees with stubs
left, build the edge (if the vertices are not already connected), and delete the chosen entry
from the edge list. We repeat this until the edge list is empty or we cannot find any vertices
which still lack connections and are not already connected to each other. If there are edges
we could not build (typically there are none, and very seldom there are more than one or
two), we substitute them by randomly connecting vertices.
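
A possible sketch of this final stage is given below. It is a variant rather than a literal transcription: instead of retrying random vertex pairs, it enumerates all admissible pairs for the drawn entry and picks one uniformly, and it stops after `max_failures` unsuccessful draws. The function name, the `max_failures` parameter, and the convention of reading only $k_1 \le k_2$ from the symmetric $P'_d$ (halving the doubled diagonal) are assumptions of this example. Entries that could not be built are returned to the caller.

```python
import random


def fill_remaining_edges(p_rem, target_degree, adj, rng, max_failures=1000):
    """Place the edges left in P'_d; return the entries that could not be built."""
    # One entry per remaining connection; p_rem must be symmetric.
    edge_list = []
    for (k1, k2), count in p_rem.items():
        if k1 < k2:
            edge_list += [(k1, k2)] * count
        elif k1 == k2:
            edge_list += [(k1, k2)] * (count // 2)

    def with_free_stubs(k):
        return [v for v in sorted(adj)
                if target_degree[v] == k and len(adj[v]) < k]

    failures = 0
    while edge_list and failures < max_failures:
        index = rng.randrange(len(edge_list))
        k1, k2 = edge_list[index]
        pairs = [(u, w) for u in with_free_stubs(k1) for w in with_free_stubs(k2)
                 if u != w and w not in adj[u]]
        if not pairs:
            failures += 1          # keep the entry and try another draw
            continue
        u, w = rng.choice(pairs)
        adj[u].add(w)
        adj[w].add(u)
        p_rem[(k1, k2)] -= 1
        p_rem[(k2, k1)] -= 1
        edge_list.pop(index)
    return edge_list


target_degree = {0: 1, 1: 1, 2: 2, 3: 2}
adj = {v: set() for v in target_degree}
p_rem = {(1, 2): 2, (2, 1): 2, (2, 2): 2}
leftover = fill_remaining_edges(p_rem, target_degree, adj, random.Random(0), 50)
```

Depending on the order of the draws, the small example either completes the path 0-2-3-1 (or its mirror image) or leaves one entry unbuilt, which corresponds to the rare leftover edges mentioned in the text.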

To validate our algorithm, we use two empirical networks as test cases: (i) the yeast
protein-interaction network (PIN) consisting of 1,846 proteins [18], downloaded from
Barabási's web site http://www.nd.edu/~networks/, and (ii) a subset of the internet on the
autonomous system (AS) level with 10,515 vertices (snapshot taken on 03/16/2001), taken from
http://www.cosin.org/. All self- and multiple edges were removed from each network. To test
the validity of the algorithm, one measures the joint degree distribution $P(j,k)$ and the
degree-dependent clustering $c(k)$ of the empirical networks and uses these functions as input
for the construction algorithm. The resulting random network has to display the same joint
degree distribution $P(j,k)$ (this implies that the degree distribution $P(k)$ is met as well)
and the same degree-dependent clustering $c(k)$ as the empirical one. A very sensitive test of
whether the correlation structures of the reference and the random network indeed match is a
comparison on the level of the correlation function $f(j,k)$, which varies on a much smaller
scale than the joint degree distribution $P(j,k)$. Comparing the reference correlation
function $f_{\mathrm{ref}}(j,k)$ with the resulting correlation function $f(j,k)$ by use of a
correlation coefficient (1 means total agreement, $-1$ indicates that the two functions are of
opposite sign, and 0 means no correlation between the two functions) reveals almost complete
agreement, with values of (i) 0.9999(9) and (ii) 0.999(7). A density plot of the reference
function versus the resulting correlation function in Fig. 1 verifies the excellent agreement
of the correlation functions $f(j,k)$ and $f_{\mathrm{ref}}(j,k)$, as the density of points is
almost solely centered on the diagonal. The statistics per curve are $10^3$ realizations for
the AS network and $10^4$ for the PIN.
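
The text does not state which correlation coefficient is used. The sketch below assumes the Pearson coefficient, evaluated over the degree pairs $(j,k)$ present in both functions; the function name and the dictionary inputs are hypothetical.

```python
import math


def correlation_coefficient(f_ref, f_gen):
    """Pearson coefficient between two functions f(j,k) stored as dictionaries."""
    keys = sorted(set(f_ref) & set(f_gen))     # degree pairs present in both
    x = [f_ref[key] for key in keys]
    y = [f_gen[key] for key in keys]
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant function has no defined correlation")
    return cov / math.sqrt(var_x * var_y)


f_ref = {(1, 1): 0.5, (1, 2): 1.5, (2, 2): 1.0}
f_flipped = {key: 2.0 - value for key, value in f_ref.items()}
r_same = correlation_coefficient(f_ref, f_ref)          # total agreement
r_opposite = correlation_coefficient(f_ref, f_flipped)  # opposite sign
```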

However, the main and new point of our algorithm is its ability to conserve the
degree-dependent clustering as well. The quality of agreement is shown in Fig. 2, which
compares the degree-dependent clustering $c(k)$ of empirical and generated networks. One can
see that the level of clustering in the PIN and AS networks is well reproduced.

FIG. 1: (color online) Plot of the correlation function $f(j,k)$ of the random networks
generated by the present algorithm vs the correlation function $f_{\mathrm{ref}}(j,k)$ of the
corresponding empirical network. The data are presented as a density plot. Darker red regions
contain a higher density of data points, while brighter red indicates a lower density. A
reference line $y = x$ is drawn as a guide to the eye.

FIG. 2: (color online) Degree-dependent clustering coefficient $c(k)$ vs $k$ of empirical
graphs (open, red triangles) compared to their randomized versions generated by the present
algorithm (full, green circles).



IV. CORRELATIONS AND CLUSTERING 

We wish not only to be able to reproduce correlations and clustering in empirical networks,
but also to create graphs from scratch that follow an adjustable correlation pattern expressed
by the average nearest neighbor function $k_{nn}(k)$ and a tunable degree-dependent clustering
coefficient $c(k)$. Eq. (7) defines $c_{\mathrm{eff}}(k)$ as an effective clustering. So we
might consider a graph showing

$$c_{\mathrm{eff}}(k) = \mu, \tag{8}$$

with $\mu$ being a constant between 0 and 1, as an equally clustered graph throughout all
degree classes. Therefore we may tune the level of clustering by changing $\mu$. As we are
able to control degree-degree correlations by use of the algorithm presented in [19], we can
calculate the upper limit $\lambda(k)$ and therefore the degree-dependent clustering $c(k)$
from $P(j,k)$ and the target clustering $c_{\mathrm{eff}}(k)$ via Eqs. (6) and (7). To get a
discrete correlation function $P_d(j,k)$, we first create a graph with a given degree
distribution $P(k)$ and a correlation structure characterized by a given $k_{nn}(k)$ using the
method presented in [19], and obtain its discrete correlation function $P_d(j,k)$. With
Eq. (7) we get $c(k)$ and therefore the number of triangles per degree $k$ as

$$c_d(k) = c(k)P(k)(k-1)N. \tag{9}$$

With $c_d(k)$, $P_d(j,k)$, and the resulting discretized $P_d(k)$ we have the input needed for
our algorithm. To validate the algorithm we tested it for a scale-free graph
[$P(k) \propto k^{-\gamma}$] with several levels of clustering and several degrees of
assortativity using $k_{nn}(k) \propto \exp\{[\ln(1+\ldots)]^{\alpha}\}$ as an example. The
graph size has been set to $N = 10^5$ vertices. In order to avoid intrinsic degree-degree
correlations caused by the constraint of no self- and multiple connections [19, 20], one has
to limit the maximum degree to a $k_{\max}$ depending on the scale-free exponent $\gamma$ and
the level of (dis-)assortativity controlled by $\alpha$ [19].

In Fig. 3 we show the resulting $c_{\mathrm{eff}}(k)$. The statistics per curve are 100
realizations each, with $P_d(j,k)$ drawn for each realization separately. One observes that a
level of clustering close to $\mu = 1$ is not achievable with our algorithm. Low levels of
clustering are very well reproducible, and medium levels of clustering are very well
reproducible for lower degrees, but the higher the degree, the more difficult it gets to
exceed a certain level of clustering, this level being dependent on the level of clustering of
the lower degree classes. This behavior is not surprising, as in calculating the upper bound
$\lambda(k)$ it is assumed that all vertices $i$ with a degree $k'$ smaller than $k$ have a
clustering coefficient $c_i = 1$. Thus restrictions on the level of clustering of low-degree
vertices imply stronger restrictions on the level of clustering of high-degree vertices. As
changing the assortativity via $\alpha$ has only a minor effect on the effective clustering
$c_{\mathrm{eff}}(k)$ which can be reached, it seems that the effects of degree-degree
correlations on clustering are well described by the upper bound $\lambda(k)$.

FIG. 3: Clustering $c_{\mathrm{eff}}(k) = c(k)/\lambda(k)$ vs $k$, exemplified for
$N = 10^5$, $\gamma = 2.8$, $k_{nn}(k)$ as given in the main text, and
$\mu = 0, 0.1, 0.2, 0.3,$ and $0.4$ (from bottom to top). (a) $\alpha = 0.2$ (assortative),
(b) $\alpha = 0$, and (c) $\alpha = -0.2$ (disassortative).
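
To make the construction of the clustering input concrete, the following sketch combines Eqs. (7), (8), and (9): the target clustering is $c(k) = \mu\,\lambda(k)$, and $c_d(k)$ is evaluated exactly as Eq. (9) is written, with $P(k)N$ replaced by the number of vertices of degree $k$. The function name `triangle_targets`, the dictionary inputs, and the rounding to integers are assumptions of this example.

```python
def triangle_targets(mu, lam, n_k):
    """Return c(k) = mu * lambda(k) and c_d(k) = c(k) P(k) (k - 1) N.

    lam[k] is the upper bound lambda(k); n_k[k] = P(k) * N is the number of
    vertices with degree k. Rounding c_d(k) to an integer is a choice made
    for this sketch only.
    """
    c_k = {k: mu * lam[k] for k in lam}
    c_d = {k: round(c_k[k] * (k - 1) * n_k.get(k, 0)) for k in lam}
    return c_k, c_d


# Toy input: upper bounds and vertex counts for degrees 2 and 3.
c_k, c_d = triangle_targets(0.5, {2: 1.0, 3: 1 / 3}, {2: 2, 3: 1})
```

Together with $P_d(j,k)$, the resulting dictionary `c_d` would play the role of the clustering input of the construction algorithm in Section III.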



V. CONCLUSION 

In summary, we have presented an algorithm which generates networks with an a priori fixed
degree-degree correlation structure defined by the joint degree distribution $P(j,k)$ and an
adjustable level of clustering defined by the degree-dependent clustering coefficient $c(k)$.
As clustering and degree-degree correlations are suspected to play an important role in many
dynamical processes taking place on networks, our algorithm may provide a very useful tool to
systematically study the influence of those topological properties on different dynamics.